BlackByte Ransomware – Pt 2. Code Obfuscation Analysis
In Part 1 of our BlackByte ransomware analysis, we covered the execution flow of the first stage JScript launcher, how we extracted BlackByte binary from the second stage DLL, the inner workings of the ransomware, and our decryptor code. In this blog, we will detail how we analyzed and de-obfuscated the JScript launcher, BlackByte’s code, and strings.
De-obfuscating the JScript Launcher
We received the original launcher file from an incident response case. It was about 630 KB of JScript code which was seemingly full of garbage code – hiding the real intent.
Our first approach to de-obfuscate the script was to simply scroll through the whole length of this obfuscated code, find some interesting blocks and figure out if there were any eval() function calls. We wanted to find an eval() call because this is where the script likely executes relevant code.
As seen in the screenshot below, we found that the first hundred lines of code were mostly unused, garbage code. At line 2494 is a blob of seemingly Base64 encoded strings (which turned out to be the main payload). Then at line 7511 is a lone eval() call.
The next step was to trace back the code beginning from the eval() call at line 7511, finding the references to the eval’s parameter variable name – “bnlpgh”, then start following the flow and references until we obtained the real code.
Here is an initial flow we followed starting from the eval() call.
Following the code in this fashion, we were able to distinguish the real code from the garbage. We then prettified the code and renamed the variables to be readable. The code snippet below reveals the first layer:
Above you may see in the first layer code our renamed variable - secondLayerEncoded - this is a string that looks like it was encoded in Base64. Although that is true, it is a Base64 string that has been reversed.
The script creates an XML document object, and using this object, creates an HTML element named “tmp”. Next, the script writes the decoded second layer from the variable assigned to secondLayerEncoded into the created element. It then reads it back as a “binaryStream” and finally runs it using eval().
After decoding and prettifying the second layer, the result looks like this:
The second layer code reveals that it checks if .NET version 4.0.30319 framework is installed in the system, then proceeds to decode the malware payload (the Base64 strings shown in Figure 2 at line 2494). Afterward, it creates a memory stream object to where it writes the decoded Base64 payload. To run it, it uses the Deserialize_2 method of the System.Runtime.Serialization.Formatters.Binary.BinaryFormatter COM object to load managed code via object Deserialization. When invoked, it creates an instance of “jSfMMrZfotrr” – a class from the malicious .NET DLL loader.
BlackByte: De-obfuscating the Code
The BlackByte binary itself is also heavily obfuscated, both the code and the strings.
The code obfuscation needed some manual refactoring, and it proved to be tedious!
Below is a snippet of the most common code obfuscation technique we found in BlackByte’s code:
In this function, we can remove the if condition in line 7 since it is always true:
9. if ((46945 ^ 472736) == 491969) |
And in line 8, since sizeof(double) equals 8, our variable arg_46_0 will be equal to
-9992+8+9984 which is equals zero. So, we can refactor the code in line 10 like this:
13. Environment.Exit(arg_46_0); // is the same as Environment.Exit(0) |
To make it readable, we rename the function and removing all unnessary code, it would look like this:
1. internal static void kill_process() |
The same obfuscation technique has been used throughout the code. So, we can painstakingly and manually refactor every function to make it readable.
BlackByte: De-obfuscating the Strings
Another hurdle for analyzing this ransomware is the string obfuscation.
In the image above, each encrypted string is declared inside a public static object. The call to the method aCDscCCxGvmZ.k(encryptedString) is a call to a string reversal function, where it reverses the chunk of a Base64 string and then afterward joins those chunks together to form a complete Base64 encoded string.
Let’s take for example this encoded string:
public static object P() { |
First step is to reverse each chunk:
AAAACL+BAAAgD -> DgAAAB+LCAAAA |
K95vZqTDABAAA -> AAABADTqZv59K |
YbZietcdo57Pk -> kP75odcteiZbY |
AAAAOQDrIJDAC -> CADJIrDQOAAA |
Then join all together to form a complete Base64 string:
DgAAAB+LCAAAAAAABADTqZv59KkP75odcteiZbYCADJIrDQOAAAA |
The decoded base64 string is a GZip header starting at the 5th byte.
The first four bytes of the data are the length of the decoded string. So, we can remove the first four bytes, then apply GZIP decompression to the remaining data.
The next step is to decrypt the output with RC4 algorithm with the key [0xCD 0x92 0xCC 0x93 0xCD 0x98]. And finally, we get the decoded string “powershell.exe”
A CyberChef recipe below can help you with the string decoding. It accepts the whole obfuscated string function, parses the encoded string and decode it:
"args": ["User defined", "\"(.*?)\"", true, true, false, false, false, false, "List capture groups"] },
{ "op": "Fork",
"args": ["\\n", "", false] },
{ "op": "Reverse",
"args": ["Character"] },
{ "op": "Merge",
"args": [] },
{ "op": "From Base64",
"args": ["A-Za-z0-9+/=", true] },
{ "op": "To Hex",
"args": ["None", 0] },
{ "op": "Find / Replace",
"args": [{ "option": "Regex", "string": "^\\w{8}" }, "", true, false, true, false] },
{ "op": "From Hex",
"args": ["Auto"] },
{ "op": "Gunzip",
"args": [] },
{ "op": "To Hex",
"args": ["Space", 0] },
{ "op": "RC4",
"args": [{ "option": "Hex", "string": "CD 92 CC 93 CD 98" }, "Hex", "Latin1"] }
]
|
CyberChef came in handy when analyzing this malware. But a scripting tool like Python can make the de-obfuscation process faster. I’ll leave that as an exercise:
To end this blog, we'll leave some tips on how to approach obfuscated code like this:
- Analyze the code first and see what methods it uses.
- Find any string blobs, that may be a result of encryption or encoding. This may be data, a series of hex bytes, or a base64 string. Look for any references to this and follow through.
- For scripts, keep an eye on those evaluation expressions, we are talking about eval(). You can sometimes exploit this by replacing it with alert(), msgbox(), console.log(), or a file write operation. And let the script run and see what it prints, however, this is extremely dangerous, so run it in a VM environment.
- Learn some encoding and encryption algorithms. Base64, RC4, AES, RSA, or even the simplest bitwise operations like XOR and ROTATE will come in handy.
- And lastly, use a tool and debug it. It makes you understand how it works when you follow the code.
For anyone interested, a decompiled source of BlackByte that we have partially de-obfuscated can be downloaded from this Github link:
ABOUT TRUSTWAVE
Trustwave is a globally recognized cybersecurity leader that reduces cyber risk and fortifies organizations against disruptive and damaging cyber threats. Our comprehensive offensive and defensive cybersecurity portfolio detects what others cannot, responds with greater speed and effectiveness, optimizes client investment, and improves security resilience. Learn more about us.